Discriminatively Trained i-vector Extractor for Speaker Verification
Abstract
We propose a strategy for discriminative training of the i-vector extractor in speaker recognition. The original i-vector extractor was trained by maximum-likelihood generative modeling using the EM algorithm. In our approach, the i-vector extractor parameters are numerically optimized to minimize a discriminative cross-entropy error function. Two versions of i-vector extraction are studied: the original approach as defined for Joint Factor Analysis, and a simplified version in which the i-vector extractor matrix is orthogonalized.
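As a rough illustration of the kind of objective the abstract refers to, the sketch below computes a binary cross-entropy over a set of verification trials. The function names, the treatment of trial scores as log-odds, and the unweighted mean are assumptions made for illustration only; the paper's exact error function (for example, any weighting of target and non-target trials) is not given here.

```python
import math


def log_sigmoid(x):
    """Numerically stable log(1 / (1 + exp(-x)))."""
    if x >= 0:
        return -math.log1p(math.exp(-x))
    return x - math.log1p(math.exp(x))


def cross_entropy(scores, labels):
    """Mean binary cross-entropy over verification trials.

    scores: one score per trial, interpreted as log-odds (assumption)
    labels: 1 for same-speaker trials, 0 for different-speaker trials
    """
    total = 0.0
    for s, y in zip(scores, labels):
        # Penalize -log p(target) on target trials,
        # and -log p(non-target) on non-target trials.
        total -= log_sigmoid(s) if y == 1 else log_sigmoid(-s)
    return total / len(scores)
```

In a discriminative training setup, a quantity of this form would be minimized numerically with respect to the extractor parameters that produce the scores; well-separated scores give a lower value than confused ones.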
Related resources
SVMSVM: support vector machine speaker verification methodology
Support vector machines with the Fisher and score-space kernels are used for text independent speaker verification to provide direct discrimination between complete utterances. This is unlike approaches such as discriminatively trained Gaussian mixture models or other discriminative classifiers that discriminate at the frame-level only. Using the sequence-level discrimination approach we are ab...
Speaker Verification Under Adverse Conditions Using i-Vector Adaptation and Neural Networks
The main challenges introduced in the 2016 NIST speaker recognition evaluation (SRE16) are domain mismatch between training and evaluation data, duration variability in test recordings and unlabeled in-domain training data. This paper outlines the systems developed at CRIM for SRE16. To tackle the domain mismatch problem, we apply minimum divergence training to adapt a conventional i-vector ext...
Adaptation transforms of auto-associative neural networks as features for speaker verification
We present a new approach of using Auto-Associative Neural Networks (AANNs) in the conventional GMM speaker verification framework with i-vector feature extraction and PLDA modeling. In this technique, an i-vector feature extractor is trained using adaptation parameters from a mixture of AANNs. In order to model parts of each speaker’s acoustic space, a training objective function based on post...
Comparison between supervised and unsupervised learning of probabilistic linear discriminant analysis mixture models for speaker verification
We present a comparison of speaker verification systems based on unsupervised and supervised mixtures of probabilistic linear discriminant analysis (PLDA) models. This paper explores current applicability of unsupervised mixtures of PLDA models with Gaussian priors in a total variability space for speaker verification. Moreover, we analyze the experimental conditions under which this applicatio...
Combining deep speaker specific representations with GMM-SVM for speaker verification
This study combines a Gaussian mixture model support vector machine (GMM-SVM) system with a nonlinear feature transformation, discriminatively trained to extract speaker specific features from MFCCs. Separation of the speaker information component and non-speaker related information in the speech signal is accomplished using a regularized siamese deep network (RSDN). RSDN learns a hidden repres...
Publication date: 2011